ESQL: Support tags in LuceneCountOperator
#133479
Adds support for "tagged queries" to the `LuceneCountOperator`. `LuceneCountOperator` is the Lucene-native implementation for queries like:
```
FROM foo | STATS COUNT(*)
```
This is something we can often implement very quickly using Lucene's statistics. It's also used for:
```
FROM foo | WHERE a > 10 | STATS COUNT(*)
```
Here we can't use index-level statistics, but Lucene's queries have `count` methods on them that *can* be very fast because they use pre-calculated counts. For example, the filter cache stores the number of hits, so the `count` method runs in O(1) time. We need this operator to use it.

"Tagged queries" support means we should be able to use this same operator for cases like:
```
FROM foo | STATS COUNT(*) BY DATE_TRUNC(1 DAY, @timestamp)
```
This doesn't plug that into the query planner, but we should be able to do so after this PR. That would bring us to parity with aggs, à la: https://www.elastic.co/blog/how-we-made-date-histogram-aggregations-faster-than-ever-in-elasticsearch-7-11
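To make the idea concrete, here is a minimal, stdlib-only sketch of counting per tag. It is not the operator's actual code and does not use Lucene: `TaggedQuery`, `precomputedCount`, and `countByTag` are hypothetical names invented for illustration. Each "query" carries a tag (for example, one day bucket), and the sketch uses a pre-calculated count when one is available, falling back to scanning documents otherwise.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.OptionalInt;
import java.util.function.IntPredicate;

public class TaggedCountSketch {
    /**
     * Hypothetical stand-in for a query paired with the tag it produces.
     * precomputedCount models a count that is already known (e.g. cached);
     * it is empty when the count has to be computed by visiting documents.
     */
    record TaggedQuery(String tag, IntPredicate matches, OptionalInt precomputedCount) {}

    /** Counts matching docs per tag, preferring the pre-calculated count when present. */
    static Map<String, Long> countByTag(int[] docs, List<TaggedQuery> queries) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (TaggedQuery q : queries) {
            long count;
            if (q.precomputedCount().isPresent()) {
                // Fast path: no documents are visited.
                count = q.precomputedCount().getAsInt();
            } else {
                // Slow path: test every document against the query.
                count = 0;
                for (int doc : docs) {
                    if (q.matches().test(doc)) {
                        count++;
                    }
                }
            }
            // Queries sharing a tag contribute to the same output row.
            counts.merge(q.tag(), count, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Each doc is represented only by its (assumed) day bucket.
        int[] docs = {0, 0, 1, 2, 2, 2};
        List<TaggedQuery> queries = List.of(
            new TaggedQuery("day-0", d -> d == 0, OptionalInt.empty()),
            new TaggedQuery("day-1", d -> d == 1, OptionalInt.empty()),
            new TaggedQuery("day-2", d -> d == 2, OptionalInt.of(3)) // count already known
        );
        System.out.println(countByTag(docs, queries));
    }
}
```

In this toy model each tag yields one output row holding the tag value and its count, which is roughly the shape a `STATS COUNT(*) BY ...` result needs.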
Pinging @elastic/es-analytical-engine (Team:Analytics)
/** Returns a deep copy of the given block, using the blockFactory for creating the copy block. */
public static Block deepCopyOf(Block block, BlockFactory blockFactory) {
    // TODO preserve constants here.
I'm going to do this now.
LGTM, thanks Nik!
private List<ElementType> tagTypes;
nit: make `tagTypes` and `tagsToState` final?
👍
private Page buildNonConstantBlocksResult() {
    BlockUtils.BuilderWrapper[] builders = new BlockUtils.BuilderWrapper[1 + tagTypes.size()];
I think this should be `tagTypes.size()` instead of `1 + tagTypes.size()`?
I think so. Checking.
Now that #133510 is in I'm going to update the test using it.